** Data reading and variable selection from raw data
** UK National Survey of Sexual Attitudes and Lifestyle 2010-2012


** 01. Reading data **

cap log close
clear all
cd /*insert your work directory here*/
use /*read your data here*/
set more off
numlabel, add

** 02. Constructing year and country variables **

ge year=2010
lab var year "survey year"

ge country=826
lab var country "ISO country code"
//uk: 826 


** 03. ID variables **

drop pid
ge pid=sin2
lab var pid "person id"
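
* Optional check (a sketch, assuming sin2 is meant to identify each
* respondent uniquely): report any duplicated person ids.
duplicates report pid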


** 04. Basic Demographics (Sex and Age/birth year) **

ge sex=rsex
lab var sex "sex"
lab def sex 1 "male" 2 "female"
lab val sex sex

ge age=dage
lab var age "age"  /* at interview */

ge birthyr=rdoby
lab var birthyr "year of birth"
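
* Optional consistency check (a sketch, assuming all interviews took
* place in 2010-2012, so birth year + age should lie in 2009-2012):
* count respondents whose age and birth year disagree.
count if !missing(age, birthyr) & !inrange(birthyr + age, 2009, 2012)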


** 05. Siblings: includes all types of siblings **

recode sibbro-sibstsis (99=.)

ge nbro=sibbro+sibadbr+sibhabr+sibstbr
ge nsis=sibsis+sibadsis+sibhasis+sibstsis
ge nsibs=nbro+nsis

lab var nbro "number of brothers"
lab var nsis "number of sisters"
lab var nsibs "number of siblings"
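
* Note: "+" returns missing when any component is missing, so a
* respondent with one unanswered sibling item gets missing totals.
* Optional check (a sketch): count how many totals are missing.
count if missing(nbro)
count if missing(nsis)
count if missing(nsibs)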

* birth order: only rough position is available (when growing up)
ge birthorder=sibpos2
recode birthorder (1=1) (3=2) (2=3) (9=.)
replace birthorder=1 if nsibs==0 
replace birthorder=. if nsibs>0 & sibpos2==-1

lab var birthorder "birth order (3 cats)"
lab def birthorder 1 "oldest" 2 "in between" 3 "youngest"
lab val birthorder birthorder
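
* Optional check (a sketch): cross-tabulate the raw and recoded
* variables to confirm the birth-order mapping above.
tab sibpos2 birthorder, m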


** 06. Own education **

drop educ educ2

rename exams educ2
rename exams2 educ

lab var educ "highest educ qualification (4 cats)"
lab var educ2 "highest educ qualification (detailed)"
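
* Optional check (a sketch): cross-tabulate the detailed and the
* 4-category variables to see how the categories correspond.
tab educ2 educ, m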


** 07. Parents' education: Father and/or Mother **

// No parents' education available


** 08. Own occupation **

rename rsoc2010_9 occ /* 2010 SOC classification 9 cats */

lab var occ "R's own occupation (2010 SOC)"


** 09. Parents' occupation **

rename par1occ paocc
rename par2occ paocc2
rename par3occ paocc3

lab var paocc "parents' occupation at 16"
lab var paocc2 "parents employee? employer?"
lab var paocc3 "parents manager?"

rename parsc3 paclass
lab var paclass "parent(s) social class"


** 10. Tabulate the Identified Variables **

log using /*insert your log file path here*/, replace text


** Data reading and variable selection from raw data
** UK National Survey of Sexual Attitudes and Lifestyle 2010-2012


** Sex **
tab sex,m

** Age, Birth Year **
sum age birthyr, d

** Siblings **
sum nsibs nbro nsis birthorder, d


** R's Own Education & Occupation **
tab1 educ educ2 occ ,m

** Parental Occupation **
tab1 paocc* paclass,m

log close


** 11. Keep the identified variables only **

keep year country pid ///
	 sex age birthyr ///
	 nsibs nbro nsis birthorder ///
	 educ educ2 occ paclass paocc* ///
	 psu_scrm total_wt strata

** 12. Create educational years variable **

* rename the detailed education variable to educ_cat
rename educ2 educ_cat

* map each qualification category to years of education;
* categories not listed below are left missing
ge educ_yrs = .
replace educ_yrs = 8.433 if educ_cat == -1
replace educ_yrs = 13.737 if educ_cat == 1
replace educ_yrs = 12.211 if educ_cat == 2
replace educ_yrs = 12.211 if educ_cat == 3
replace educ_yrs = 12.035 if educ_cat == 4
replace educ_yrs = 11.280 if educ_cat == 5
replace educ_yrs = 11.280 if educ_cat == 6
replace educ_yrs = 11.280 if educ_cat == 7
replace educ_yrs = 11 if educ_cat == 8
replace educ_yrs = 11 if educ_cat == 9
replace educ_yrs = 10.625 if educ_cat == 10
replace educ_yrs = 10.625 if educ_cat == 11
replace educ_yrs = 10 if educ_cat == 12
replace educ_yrs = 10 if educ_cat == 13
replace educ_yrs = 11.333 if educ_cat == 14
replace educ_yrs = 8.381 if educ_cat == 15
replace educ_yrs = 8.381 if educ_cat == 99

lab var educ_yrs "Respondent's years of education"


** 13. Save the Data File **

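* Optional check before saving (a sketch): count respondents without
* a years-of-education value, and show the years assigned to each
* education category.
count if missing(educ_yrs)
tabstat educ_yrs, by(educ_cat) stat(n mean)
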
saveold /*insert your output file path here*/, replace

